PCT/EP98/08162
CONSTITUTIVE PLANT PROMOTERS 



FIELD OF THE INVENTION 

The invention is directed to new plant promoters, more specifically
those promoters which can be produced by assembling parts of promoters
which have a complementary specificity.

BACKGROUND ART

Genetic engineering of plants has become possible by virtue of two
discoveries: first of all the possibility of transferring heterologous
genetic material into the plant cell (most efficiently done by the
bacterium Agrobacterium tumefaciens or related strains) and secondly
the existence of plant promoters which are able to drive the expression
of said heterologous genetic material.

A typical plant promoter consists of specific elements. A basis is
formed by the minimal promoter element, which enables transcription
initiation, often accompanied by a sequence, also denominated as the
TATA-box, which serves as a binding place for transcription initiation
factors. In most promoters, the presence of this TATA-box is important
for proper transcription initiation. It is typically located 35 to 25
basepairs (bp) upstream of the transcription initiation site.

Another part of the promoter consists of elements which are able to
interact with DNA-binding proteins. Known are G-box binding elements,
which are based on the hexanucleotide CACGTG motif. These elements
have been shown to be able to interact with bZIP DNA-binding proteins
which bind as dimers (Johnson & McKnight, Ann. Rev. Biochem. 58,
799-839, 1989). Other G-box related motifs, such as the Iwt and PA
motifs, have been described (WO 94/12015).

These motifs have been shown to be involved in tissue-specific
promoter expression in plants. For instance, the presence of Iwt
tetramers confers embryo-specific expression, while PA tetramers confer
high-level root expression, low-level leaf expression and no seed
expression.
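
As an illustration of how such a motif can be located in a promoter
sequence, the following minimal Python sketch scans a DNA string for
the G-box hexanucleotide CACGTG. The function name and the example
sequence are hypothetical and are not taken from any promoter described
in this specification.

```python
def find_g_boxes(sequence, motif="CACGTG"):
    """Return the 0-based start index of every occurrence of the motif."""
    seq = sequence.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        # Advance by one base so that overlapping occurrences are not missed.
        start = seq.find(motif, start + 1)
    return hits


# Hypothetical sequence, for illustration only.
example = "ttgaCACGTGgcaatCACGTGa"
print(find_g_boxes(example))
```

Because CACGTG is identical to its own reverse complement, scanning one
strand is sufficient for this particular motif.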

Similarly, GT-1 like binding sites (grouped on the basis of a moderate
consensus sequence GGT*/^) are described. Such a binding site is found
far upstream in the promoter region of the Arabidopsis plastocyanin
promoter and seems to be involved in activation of transcription
during light periods (Fisscher, U. et al., Plant Mol. Biol. 26,
873-886, 1994).

Another sequence-related phenomenon which is often found in plant
promoters is the presence of sequences which enable the formation of
Z-DNA. Z-DNA is DNA folded in a left-handed helix, which is caused by
repeats of the dinucleotides GC or AC. It is believed that folding in a
Z-form influences the availability of the DNA for approach by RNA
polymerase molecules, thus inhibiting the transcription rate.
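
A simple way to flag candidate Z-DNA forming stretches is to look for
runs of the dinucleotides named above. The sketch below is only an
illustration: the function name and the minimum repeat count are
assumptions, and only the strand that is passed in is examined.

```python
import re


def find_dinucleotide_repeats(sequence, min_repeats=4):
    """Find runs of GC or AC repeated at least min_repeats times.

    min_repeats is an assumed threshold, not a value from the text.
    Returns (start, end, unit) tuples with 0-based, end-exclusive indices.
    """
    pattern = re.compile(
        r"(?:GC){%d,}|(?:AC){%d,}" % (min_repeats, min_repeats)
    )
    results = []
    for match in pattern.finditer(sequence.upper()):
        # The first two bases of the match identify the repeated unit.
        results.append((match.start(), match.end(), match.group()[:2]))
    return results


# Hypothetical sequence, for illustration only.
print(find_dinucleotide_repeats("TTGCGCGCGCAA"))
```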

One of the earliest and most important inventions in the field of
plant protein expression is the use of (plant) viral and
Agrobacterium-derived promoters that provide a powerful and
constitutive expression of heterologous genes in transgenic plants.
Several of these promoters have been used very intensively in plant
genetic research and still are the promoters of choice for rapid,
simple and low-risk expression studies. The most famous are the 35S
and 19S promoters from Cauliflower Mosaic Virus (CaMV), which were
already found to be practically useful in 1984 (EP 0 131 623), and the
promoters which can be found in the Agrobacterium T-DNA, like the
nopaline synthase (nos), mannopine synthase (mas) and octopine
synthase (ocs) promoters (EP 0 122 791, EP 0 126 546, EP 0 145 338). A
plant-derived promoter with similar characteristics is the ubiquitin
promoter (EP 0 342 926).

In time, several attempts have been made to increase the level of
expression of these promoters. Examples are the double enhanced 35S
promoter (US 5,164,316) and, more recently, the superpromoter, which
couples parts of the Agrobacterium promoters (EP 729 514).

However, in many cases these promoters do not fulfill the criteria of
an ideal promoter. All promoters described above show a clear pattern
of organ- or developmental-specific expression, and frequently the
pattern of expression found with these promoters is not ideal for some
applications. Especially for biotechnological applications like the
engineering of fungal and insect resistance, which require expression
both in the right location and in the right timeframe of plant
development, there is a need for new constitutive promoters which are
able to give a high level of transgene expression at exactly the right
time and place.

SUMMARY OF THE INVENTION

The invention provides for novel plant promoters, characterized in
that they comprise 1) a minimal promoter and 2) transcription-activating
elements from a set of promoters, which elements direct a complementary
pattern and level of transcription in a plant.

More specifically, this plant promoter is a constitutive promoter in
which each of the transcription-activating elements does not exhibit
an absolute tissue-specificity, but mediates transcriptional activation
in most plant parts at a level of >1% of the level reached in the part
of the plant in which transcription is most active. An example of such
a promoter pair is a set of promoters in which one is most active in
green parts of the plant, while the other promoter is most active in
underground parts of the plant. More specifically the new promoter is
a combination of the ferredoxin and the rolD promoter. Preferably in
this construct the minimal promoter element is derived from the
ferredoxin promoter and the ferredoxin promoter is derived from
Arabidopsis thaliana. The rolD promoter is derived from Agrobacterium
rhizogenes.

Also part of the invention is a plant promoter which is a combination
of the plastocyanin and the S-adenosyl-methionine-1 promoter, whereby
preferably the minimal promoter element is derived from the
S-adenosyl-methionine-1 promoter and both the plastocyanin promoter and
the S-adenosyl-methionine-1 promoter are derived from Arabidopsis
thaliana.

Further part of the invention are chimaeric gene constructs for the
expression of genes in plants comprising the above disclosed promoters.

DESCRIPTION OF THE FIGURES

Figure 1: Schematic representation of pMOG410 and pMOG1059 

Figure 2: Distribution of GUS expression of potato lines transformed
with the constructs pMOG1059 and pMOG410. GUS staining was
judged visually and divided into classes of expression,
relative to the highest GUS expression measured in our lab
(set at 4). A value of zero indicates no visible expression.

Figure 3: Graphic representation of the average expression of GUS
enzyme in primary transformants of tomato, oilseed rape and
potato. GUS expression was determined visually and compared
to a high level expressing 35S GUS transgenic tobacco plant
ranking a score of 4. Standard deviations of the measured
values are indicated on each of the bars.

Figure 4: Graphic representation of the distribution of potato plants
with various levels of GUS expression containing SAM-1-, Pc-,
35S- and PcSAM1-GUS constructs. Scored are expression in
leaf mesophyll, leaf vascular system, stem and root.



DETAILED DESCRIPTION OF THE INVENTION 

For the purpose of this specification the following definitions 
are valid: 

A promoter consists of an RNA polymerase binding site on the DNA, 
forming a functional transcription initiation start site. A promoter 
usually consists of at least a TATA box and possibly of other 
sequences surrounding the transcription initiation site (initiator) 
and can either be used isolated (minimal promoter) or linked to 
binding sites of transcription-activating elements, silencers or 
enhancers that may enhance or reduce transcription initiation rates, 
and which may function respective of developmental stage, or external 
or internal stimuli. 



The initiation site is the position surrounding the first nucleotide
which is part of the transcribed sequence, which is also defined as
position +1. With respect to this site all other sequences of the gene
and its controlling regions are numbered. Downstream sequences (i.e.
further protein encoding sequences in the 3' direction) are
denominated positive, while upstream sequences (mostly of the
controlling regions in the 5' direction) are denominated negative.
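
The numbering scheme can be made concrete with a short Python sketch
that converts an index in a sequence string to a position relative to
+1. It assumes the widely used convention that there is no position 0,
so that the base directly upstream of +1 is numbered -1; the function
and argument names are illustrative only.

```python
def relative_position(index, tss_index):
    """Convert a 0-based sequence index to a position relative to +1.

    tss_index is the 0-based index of the first transcribed nucleotide.
    Assumes there is no position 0: the base directly upstream of +1
    is numbered -1.
    """
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


# The first transcribed base is +1, its upstream neighbour is -1.
print(relative_position(10, 10), relative_position(9, 10))
```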

A minimal promoter is a promoter consisting only of all basal elements
needed for transcription initiation, such as a TATA-box and/or
initiator.



An enhancer is a DNA element which, when present in the neighbourhood
of a promoter, is able to increase the transcription initiation rate.

A promoter is constitutive when it is able to express the gene that it
controls in all or nearly all of the plant tissues during all or
nearly all developmental stages of the plant.

Specific expression is the expression of gene products which is
limited to one or a few plant tissues (spatial limitation) and/or to
one or a few plant developmental stages (temporal limitation). It is
acknowledged that hardly any true specificity exists: promoters seem to
be preferentially switched on in some tissues, while in other tissues
there can be no or only little activity. This phenomenon is known as
leaky expression. However, specific expression in this invention means
preferential expression in one or a few plant tissues.



The expression pattern of a promoter (with or without enhancer) is the
pattern of expression levels which shows where in the plant and in
what developmental stage transcription is initiated by said promoter.

Expression patterns of a set of promoters are said to be complementary
when the expression pattern of one promoter shows little overlap with
the expression pattern of the other promoter.



The level of expression of a promoter can be determined by measuring
the 'steady state' concentration of a standard transcribed reporter
mRNA. This measurement is indirect since the concentration of the
reporter mRNA is dependent not only on its synthesis rate, but also on
the rate with which the mRNA is degraded. Therefore the steady state
level is determined by both the synthesis rate and the degradation
rate. Degradation can however be considered to proceed at a fixed
rate when the transcribed sequences are identical, and thus this value
can serve as a measure of synthesis rates. When promoters are compared
in this way, techniques available to those skilled in the art are
hybridisation S1-RNAse analysis, Northern blots and competitive
RT-PCR. This list in no way represents all available techniques, but
rather describes commonly used procedures to analyse transcription
activity and expression levels of mRNA.
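
The reasoning that identical transcripts allow steady-state levels to
stand in for synthesis rates can be sketched numerically. The example
below assumes simple first-order degradation (an assumption made here,
not a statement from the text), in which case the steady-state level
equals the synthesis rate divided by the degradation constant. All
numbers are hypothetical.

```python
def steady_state_level(synthesis_rate, degradation_constant):
    """Steady-state mRNA level under first-order decay.

    Solves dR/dt = synthesis_rate - degradation_constant * R = 0 for R.
    """
    if degradation_constant <= 0:
        raise ValueError("degradation_constant must be positive")
    return synthesis_rate / degradation_constant


# Two hypothetical promoters driving the same reporter transcript,
# which therefore share one degradation constant (arbitrary units).
k = 0.5
level_a = steady_state_level(20.0, k)
level_b = steady_state_level(5.0, k)

# With a shared k, the ratio of levels equals the ratio of synthesis rates.
print(level_a / level_b)
```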



One of the technical difficulties encountered in such an analysis is
that the qualitatively best results can only be obtained by fusing
transcriptional activating parts to the reporter RNA molecule, in such
a way that only reporter sequences are transcribed. This requires the
exact determination of the RNA synthesis start, and joining at that
point the sequences of the reporter mRNA.

This is important for a number of reasons. First, the analysis of
transcription start points in practically all promoters has revealed
that there is usually no single base at which transcription starts,
but rather a more or less clustered set of initiation sites, each of
which accounts for some start points of the mRNA. Since this
distribution varies from promoter to promoter, the sequences of the
reporter mRNA in each of the populations would differ from each other.
Since each mRNA species is more or less prone to degradation, no
single degradation rate can be expected for different reporter mRNAs.
Secondly, it has been shown for various eukaryotic promoter sequences
that the sequence surrounding the initiation site ('initiator') plays
an important role in determining the level of RNA expression directed
by that specific promoter. This includes also part of the transcribed
sequences. The direct fusion of promoter to reporter sequences would
therefore lead to markedly suboptimal levels of transcription.

Leaving in these transcribed sequences does allow determining the
transcription rates, but potentially alters the stability of the
reporter mRNA and influences translation initiation rates of an
eventual open reading frame.

The role of this analysis, however, is the determination of the
relative level of constitutive expression of a heterologous protein,
as is the most frequently used application in biotechnology. Therefore
the most important parameter is the ability of the tested sequences to
drive high level expression of a heterologous reporter protein.

This would involve coupling the coding sequences of a reporter protein
to the transcription activating part, promoter and 5' untranslated
sequence of the gene which is tested for its properties. In this way a
complex set of effects (combining transcription rates, mRNA stability
(and thus degradation rates of the mRNA) and translational initiation
rates) is reduced to one value that is very useful for determining the
usefulness of the tested gene elements in biotechnological
applications.

There is no current word or phrase to describe this value. In the
course of this application, next to the term 'expression value' the
terms 'expression level' and 'transcriptional activity' are used. We
realize that this may cause some confusion. In all cases we do
indicate with these and related terms the value just mentioned.

A commonly used procedure to analyse expression patterns and levels is
through determination of the 'steady state' level of protein
accumulation in a cell. Commonly used candidates for the reporter
gene, known to those skilled in the art, are β-glucuronidase (GUS),
Chloramphenicol Acetyl Transferase (CAT) and proteins with fluorescent
properties, such as Green Fluorescent Protein (GFP) from Aequorea
victoria. In principle, however, many more proteins are suitable for
this purpose, provided the protein does not interfere with essential
plant functions. For quantification and determination of localization
a number of tools are suited. Detection systems can readily be created
or are available which are based on e.g. immunochemical, enzymatic or
fluorescent detection and quantification. Protein levels can be
determined in plant tissue extracts or in intact tissue using in situ
analysis of protein expression.

Generally, individual transformed lines with one chimeric
promoter-reporter construct will vary in their levels of expression of
the reporter gene. Also frequently observed is the phenomenon that such
transformants do not express any detectable product (RNA or protein).
The variability in expression is commonly ascribed to 'position
effects', although the molecular mechanisms underlying this inactivity
are usually not clear.

The term average expression is used here as the average level of
expression found in all lines that do express detectable amounts of
reporter gene product, so leaving out of the analysis plants that do
not express any detectable reporter mRNA or protein.
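
The definition of average expression can be expressed as a small Python
sketch. The function name is hypothetical, and the scores in the
example are invented for illustration; a score of zero stands for a
line without detectable expression.

```python
def average_expression(scores):
    """Mean score over lines with detectable expression (score > 0).

    Non-expressing lines are left out of the average, as in the
    definition above. Returns None when no line expresses detectably.
    """
    expressing = [s for s in scores if s > 0]
    if not expressing:
        return None
    return sum(expressing) / len(expressing)


# Hypothetical visual scores for six transformed lines.
print(average_expression([0, 3, 4, 0, 2, 3]))
```

Note that this average says nothing about how many lines fail to
express; that frequency has to be reported separately.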

Root expression level indicates the expression level found in protein
extracts of complete plant roots. Likewise, 'leaf' and 'stem
expression levels' are determined using whole extracts from leaves and
stems. It is acknowledged, however, that within each of the plant parts
just described, cells with variable functions may exist, in which
promoter activity may vary.

For the promoters described in this application the expression levels
in large plant parts, containing cells with various functions, are
measured. However, more detailed analyses may contribute to
construction of a promoter that is even 'more constitutive', by taking
more cell types within a plant part into account.

As a standard for judging expression levels the 35S promoter of the
Cauliflower Mosaic Virus is a convenient and widely used standard.
The average expression level of this promoter may be classified as
medium high.

The invention shows that it is possible to combine elements from one
promoter which are responsible for a specific expression with elements
from another promoter which are responsible for a complementary
expression pattern to form a promoter which - as a result - shows
expression in the tissues and developmental stages which form part of
the expression pattern of both promoters. If the complementation
results in activity in (nearly) all the cells of the plant, such
complementation will yield a constitutive promoter. It seems to be
necessary, however, that both promoters have a low expression value in
the tissues and developmental stages which are specific for the other
promoter. It has been established that, for being suitable, the
transcriptional activity in the plant parts where expression is low
should be preferably >1% of the level of transcription which is
reached in the plant parts where transcriptional activity is high.

This limits the availability of promoters and promoter elements from
which to build a new constitutive promoter. Suitable promoter-pairs
which fulfil the above mentioned criteria are:

- the ferredoxin promoter in combination with the rolD promoter
- the S-adenosyl methionine promoter in combination with the
  plastocyanin promoter

Other promoter-pairs which are complementary and which show at least
some expression in the tissues and developmental stages which are
specific for the other promoter can also be applied.

Delineation of promoter and/or enhancer parts needed. 

Whereas transcription-regulating elements, especially in eukaryotes,
may be present at large distances from the promoter/transcription
initiation site, and located either downstream or upstream of the
initiation site, many plant genes have most of their regulatory
elements in the area directly upstream of the promoter. In order to
identify the main transcription-activating elements of promoters it is
common procedure to link parts of the non-transcribed areas that are
found upstream (and downstream) of the promoter to a reporter gene, to
analyse the ability of each of the truncated DNA elements to direct
expression of that reporter. For delineation of more promoter-proximal
sequences involved in transcription regulation, fragments of the
enhancer sequences are most commonly coupled to a promoter, which may
be derived from the gene of which transcription regulation is studied.
Alternatively, a heterologous promoter can be used such as the
sequences of the 35S promoter from -46 to +4, relative to the
transcription start, which is functionally coupled to a reporter gene
as described above.

In this way it is possible to delineate the transcription activating 
elements of most genes, a process that is well-known to those skilled 
in the art. 

A large number of transcription regulatory elements of genes have been
analysed in such a manner, and data relevant for this analysis are
directly available to those skilled in the art through scientific
publications.

Transcription activating elements that on average can direct
expression to approximately the average level of the 35S promoter (at
least 50% of this level) in at least some of the plant parts, and that
are also capable of directing at least 0.5% (of the 35S level)
transcription in other plant parts are then selected for further use.
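
The two thresholds in this selection step can be written down as a
short check. The sketch below is one possible reading of the criterion;
the function name, the dictionary layout and the example values are
assumptions made for illustration.

```python
def is_candidate_element(levels_by_part, ref_35s_level,
                         high_fraction=0.5, low_fraction=0.005):
    """Check the two selection thresholds stated in the text.

    levels_by_part maps a plant part to the average expression level
    driven by the element, in the same units as ref_35s_level (the
    average level of the 35S promoter).
    """
    levels = list(levels_by_part.values())
    if not levels:
        return False
    # At least 50% of the 35S level in at least one plant part.
    reaches_high = any(v >= high_fraction * ref_35s_level for v in levels)
    # At least 0.5% of the 35S level in every other plant part.
    never_too_low = all(v >= low_fraction * ref_35s_level for v in levels)
    return reaches_high and never_too_low


# Hypothetical element: strong in roots, weak but detectable in leaves.
print(is_candidate_element({"root": 80.0, "leaf": 2.0}, ref_35s_level=100.0))
```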

The minimal promoter element is typically derived from one of the
promoters of the promoter-pair, although not necessarily. It can be
envisaged that such a minimal element is derived from a third promoter
or is even made synthetically.

Based on the results of the analysis described above, transcription
activating parts with complementary activities are selected. That is,
for example, to obtain a promoter with expression throughout the plant,
transcription activating DNA fragments that direct high level root
expression with lower leaf and stem expression levels are combined
with elements that direct expression mainly in the leaf and stem, but
lower in the root. Other combinations of complementary transcription
activating parts are obvious.

Preferentially, the level of expression in the parts where expression
is lowest does not fall below 1% of the level obtained in the highest
part. More preferred is the situation where the ratio between lowest
expression and highest expression between plant parts is larger than
5%.
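
The 1% and 5% figures amount to a ratio of the lowest to the highest
expression level across plant parts. A minimal sketch of that
calculation follows; the function names, the category labels and the
example values are hypothetical.

```python
def lowest_to_highest_ratio(levels_by_part):
    """Ratio of the lowest to the highest expression level across parts."""
    values = list(levels_by_part.values())
    highest = max(values)
    if highest <= 0:
        raise ValueError("no detectable expression in any plant part")
    return min(values) / highest


def classify(levels_by_part):
    """Label a promoter using the 1% and 5% thresholds from the text."""
    ratio = lowest_to_highest_ratio(levels_by_part)
    if ratio > 0.05:
        return "more preferred"
    if ratio >= 0.01:
        return "preferred"
    return "below preferred range"


# Hypothetical expression levels in arbitrary units.
print(classify({"leaf": 100.0, "stem": 60.0, "root": 10.0}))
```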

This coupling can most easily be done by known genetic engineering
techniques. The gene which has to be expressed by the new constitutive
promoter can be cloned behind the promoter. It is advisable to build
in a unique NcoI cloning site at the linkage of the 5' untranslated
sequence attached to the promoter to allow precise junction of the
open reading frame (ORF) and the 3' end of the promoter in which the
gene of interest can be inserted.

The ferredoxin-rolD pair.

One of the preferred combinations of the present invention is a
constitutive plant promoter comprising elements of both the ferredoxin
promoter and the rolD promoter. Preferably the ferredoxin promoter is
obtained from Arabidopsis thaliana, where it drives the ferredoxin A
gene, a gene which is involved in photosynthesis. The expression of
this gene and the responsiveness of its promoter to light have been
reported (Vorst, O. et al., Plant Mol. Biol. 14, 491-499, 1990; Vorst,
O. et al., The Plant J. 3(6), 793-803, 1993; Dickey, L.F. et al., The
Plant Cell 6, 1171-1176, 1994). Since the ferredoxin gene is involved
in photosynthesis the promoter is most active in green tissue. mRNA
levels were shown to be high in chloroplast-containing organs such as
stem, leaves and bracts, but also in young growing tissues, such as
whole flowers and seedlings. Interestingly, there is a smaller, but
significant expression in soilborne areas of the plant. The promoter
sequence contains both a G-box and an I-box containing region. Also a
potential Z-folding DNA sequence is found at position -182.

The rolD promoter is reported to have strong expression in the roots
and is obtainable from Agrobacterium rhizogenes. Although the source
organism is a bacterium, the promoter is very suitable for expression
in plants because the bacterium is a phytopathogen which causes
hairy-root disease in plants. For that purpose it transfers DNA to the
plant, amongst which the rolD gene, which is responsible for root
elongation. To be expressible in plants this gene needed a strong
promoter functional in plants, the rolD promoter. GUS studies have
shown that expression under control of the rolD promoter yields mainly
root-specificity (Leach, F. and Aoyagi, K., Plant Sci. 79, 69-76,
1991). Also, some expression in leaves was observed.



WO 99/31258

A combination of the ferredoxin and the rolD promoter can be obtained
in two ways, depending on from which promoter the minimal promoter
element and 5' untranslated sequences will be taken. In our examples
we have used the minimal promoter element from the ferredoxin
promoter, but deriving it from the rolD promoter is equally well
possible.

The S-adenosyl-methionine synthetase and plastocyanin pair.

Another favorable promoter can be obtained from a combination of the
S-adenosyl-methionine synthetase (SAM) promoter and specific parts of
the plastocyanin promoter. Preferably, both promoters are obtained
from Arabidopsis thaliana.

The SAM promoter regulates the expression of S-adenosyl-methionine
synthetase, which is an enzyme active in the synthesis of polyamines
and ethylene. Promoter studies showed a strong expression in vascular
tissues, in callus, sclerenchyma and some activity in root cortex
(Peleman, J. et al., The Plant Cell 1, 81-93, 1989), which was reasoned
to be due to the involvement of the enzyme in lignification.

The plastocyanin promoter, like the ferredoxin promoter, is also a
promoter which is active in the photosynthetic pathway. mRNA levels
are high in green, chloroplast-containing structures, such as leaves,
cauline leaves, stem and whole seedling. Also in flowers the promoter
is very active. Little expression is detectable in silique, seed and
root (Vorst, O. et al., The Plant J. 4(6), 933-945, 1993).

By combining these specificities it is possible to create a chimeric
promoter that drives good expression both in the photosynthetic areas
of the leaf and stem, as well as in the areas not involved in
photosynthesis, such as the cells forming and surrounding the vascular
system in leaves and stems.


Other pairs of promoters. 

The above given examples of promoter-pairs show in both cases the
presence of a promoter which is active during photosynthesis. It is
envisaged that other promoters which regulate expression of a gene
needed for photosynthetic activity may be suitable for a combination
with either the rolD or other root-preferential promoters.

In the construction of a promoter that drives expression throughout
the plant: if one of the components is a promoter which is more or
less specific for green parts, this automatically means that the other
promoter of the pair should be predominantly (but not exclusively)
expressed in the roots and other non-photosynthesizing organs.

In the construction of a promoter that drives expression in all parts
of leaves and stems, the combination may be made by using a promoter
which is more or less specific for green parts and a promoter which
drives expression primarily in the vascular system.

However, the invention is not limited to the combination of a
root-preferential and a green-part-preferential promoter, or a
combination of green-part-preferential and vascular-system-preferential
promoters. All promoter combinations can be used, provided that the
expression patterns of the individual promoters are complementary.

It is also possible that the elements from which e.g. a new
constitutive promoter is composed are derived from a set with more
than two promoters. The above discussed complementarity should then
also exist.

EXPERIMENTAL PART

Example 1

Cloning of the chimeric Fd-rolD promoter: 

A 512 bp Arabidopsis thaliana ferredoxin promoter fragment (O. Vorst
et al., 1990, PMB 14, 491-499) ranging from position -512 to +4
(relative to the ATG start codon of the ferredoxin Open Reading Frame)
was isolated by digestion with HincII and NcoI. This fragment contains
most of the transcriptional regulatory sequences of the ferredoxin
promoter, the promoter sequences and leader of the ferredoxin
transcript. An XbaI site was introduced, for cloning reasons, at
positions -5 to -10 relative to the ATG (O. Vorst et al., 1990, PMB
14, 491-499). This changes the original sequence of the clone at this
point from ACAAAA to TCTAGA (SEQ ID NO: 1).
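
The sequence change that creates the XbaI site (recognition sequence
TCTAGA) can be illustrated in Python. The leader fragment in the
example is invented; only the hexamers ACAAAA and TCTAGA and the
positions -10 to -5 come from the text. The sketch assumes that the A
of the ATG is +1 and that there is no position 0.

```python
def introduce_site(upstream, old="ACAAAA", new="TCTAGA", end_offset=4):
    """Replace the hexamer occupying positions -10 to -5 before the ATG.

    upstream is the sequence ending directly before the ATG, so its
    last base is position -1. end_offset is the number of bases
    (positions -4 to -1) between the hexamer and the ATG.
    """
    start = len(upstream) - end_offset - len(old)
    if start < 0 or upstream[start:start + len(old)] != old:
        raise ValueError("expected hexamer not found at positions -10 to -5")
    return upstream[:start] + new + upstream[start + len(old):]


# Hypothetical leader fragment, for illustration only.
leader = "GGTTCC" + "ACAAAA" + "GTCA"
print(introduce_site(leader))
```

The explicit check on the original hexamer guards against applying the
substitution at the wrong offset.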

Part of the Agrobacterium rhizogenes rolD upstream sequences (SEQ ID
NO: 2) (Leach et al., 1991, Plant Sci. 79, 69-76) was fused to the
ferredoxin promoter sequences described above. A HindIII-RsaI
fragment, comprising nucleotides -335 to -86 relative to the
initiation codon, was cloned next to the ferredoxin fragment, joining
the RsaI site of the latter with the HincII site of the former.

This chimaeric element, containing the promoter and some of the
activating sequences of the ferredoxin gene, and upstream activating
sequences of the rolD gene, was used in subsequent studies as to its
transcription-stimulating properties (SEQ ID NO: 3).

Example 2 
GUS fusions

The Fd-rolD chimaeric promoter/activator was coupled to the GUS gene,
engineered to contain an intron (Jefferson et al., (1987) EMBO J
6: 3901-3907). The NcoI restriction site on the ATG start codon was
used to join the promoter to the Open Reading Frame (ORF) of the GUS
gene, coupled to a 265 bp fragment containing the Proteinase Inhibitor
II 3' untranslated and transcriptional termination sequences
(Thornburg et al., 1987, Proc. Natl. Acad. Sci. USA 84, 744-748; An et
al., Plant Cell 1, 115-122).

The whole expression cassette, containing the promoter, GUS gene and
3' PI-II sequences, was cloned out using BamHI and EcoRI and introduced
into the binary vector pMOG800 (deposited at the Centraal Bureau voor
Schimmelcultures, Baarn, The Netherlands, under CBS 414.93, on August
12, 1993) digested with the same enzymes. The resulting construct
(pMOG1059) was used in transformation experiments with various plants.
As a control a 35S CaMV promoter-GUS construct was used. This is
construct pMOG410. A schematic representation of both constructs is
found in Figure 1.

Example 3

Expression levels and patterns of promoter activity during early 
stages of plant transformation 

First, Arabidopsis thaliana transformants were made with both
constructs and GUS expression was followed in time during the
transformation procedure.

GUS expression levels were determined visually, on a scale of 0 to 5,
where 0 is no detectable expression and 5 is the highest level of GUS
we have observed in leaves of a transgenic plant, namely a rare tobacco
35S-GUS transgenic (line 96306). Samples from leaves of this plant
were included in all experiments for internal reference.

In Table 1 the relative GUS expression in Arabidopsis thaliana
explants is indicated, at several times after Agrobacterium
tumefaciens cocultivation (DAC; days after cocultivation).
Table 1. Relative GUS activity of Arabidopsis root explants.

Time of assay    pMOG1059    pMOG410
DAC 0            2           3
DAC 2            3           3
DAC 5            3           2
DAC 7            4           3
DAC 9            4           3
DAC 12           4           3

As can be seen from this comparison, GUS expression driven by the
chimaeric promoter starts slightly later after cocultivation but from
day 7 on, exceeds the level of expression obtained with the reference
35S promoter.

Very similar data were obtained when Brassica napus explants were
scored for GUS expression. At day 5 after co-cultivation the 35S
promoter is slightly higher, but the situation is reversed on day 20
after co-cultivation. Also for tomato similar data were obtained. Here
even at the earliest stage of analysis expression of
pMOG1059-transgenics exceeded that of pMOG410 transgenics.

Example 4 

Expression levels and patterns in in vitro grown plants 

When plants are grown up further, differences between these promoters
become ever clearer. Leaf samples of fully regenerated plants were
analysed for GUS expression. Averages were obtained from 11-48 plants,
dependent on the construct.

For Arabidopsis thaliana that was grown in vitro only, no large
difference was observed between GUS expression in pMOG1059 and
pMOG410-transgenics.

Table 2. Average relative GUS activity of leaf samples of all tested
crops.

Construct:         pMOG1059    pMOG410

Crop:

Potato             4.0         2.1
Brassica napus     3.7         2.8
Arabidopsis        4.0         4.1
Tomato             2.2         2.1
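
The averages in Table 2 can be compared per crop with a few lines of Python. This is only an illustrative sketch: the names TABLE2 and score_ratios are hypothetical, and the ratio of two visual-score averages is not a quantity reported in the text itself.

```python
# Average relative GUS scores (scale 0-5) transcribed from Table 2.
# Tuple order: (pMOG1059, pMOG410).
TABLE2 = {
    "Potato": (4.0, 2.1),
    "Brassica napus": (3.7, 2.8),
    "Arabidopsis": (4.0, 4.1),
    "Tomato": (2.2, 2.1),
}


def score_ratios(table):
    """Return the pMOG1059 / pMOG410 score ratio for each crop."""
    return {crop: chim / ref for crop, (chim, ref) in table.items()}


if __name__ == "__main__":
    for crop, ratio in score_ratios(TABLE2).items():
        print(f"{crop:15s} {ratio:.2f}")
```

A ratio above 1 means the average visual score was higher for pMOG1059 than for pMOG410 in that crop. Because the underlying scale is visual and ordinal, the ratios should be read as a rough ranking only.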



What is also clear from the data presented in Figure 2 is that a
significant number of 35S-GUS transgenic lines (approx. 50% was found
repeatedly in our experiments) do not express GUS to a visible level.
So not only are maximum and average expression higher in
the Fd-rolD-GUS transgenics, but the frequency with which transgenic
plants express GUS is also strongly enhanced. In about 50 transgenic
potato plants carrying the Fd-rolD-GUS construct, we have found no
weak expressor, suggesting a reliable high expression in at least 98%
of the lines made.
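
The 98% figure follows from the sample size: with no weak expressor among about 50 lines, any weak-expressor frequency below 1 in 50 could not have been observed. The sketch below reproduces that arithmetic and, as an additional assumption that is not in the text, computes an exact one-sided 95% upper bound for zero events among independent lines. The function names are hypothetical.

```python
def resolution_bound(n_lines):
    """Smallest non-zero weak-expressor frequency observable in n_lines plants."""
    return 1.0 / n_lines


def exact_upper_bound(n_lines, confidence=0.95):
    """One-sided upper confidence bound on the weak-expressor frequency.

    Assumes zero weak expressors were seen among n_lines independent lines:
    solving (1 - p) ** n_lines = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n_lines)


if __name__ == "__main__":
    n = 50  # "about 50" transgenic potato plants, as stated in the text
    print(f"strong expressors, simple bound: {1 - resolution_bound(n):.0%}")
    print(f"weak expressors, 95% upper bound: {exact_upper_bound(n):.1%}")
```

The simple bound gives the 98% quoted in the text. The confidence bound is more conservative and depends on the independence assumption and on the chosen confidence level.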

Example 5 

Comparison of promoter performance in various crops 

Constructs pMOG410 (35S-GUS) and pMOG1059 (Fd-rolD-GUS) were also
introduced into oilseed rape and tomato for a further comparison of
promoter performance. The data for potato are also included here.

As shown in Figure 3A, in tomato the overall level of expression of
the Fd-rolD promoter is higher both at the latest stage of in vitro
growth and in leaves of 4 and 7 week old plants. This also holds true
in stems of 7 week old plants; in roots, however, an on average weaker
expression is observed with the Fd-rolD promoter than with the 35S
promoter.

In oilseed rape and potato similar results are obtained, with
the notable exception that in potato roots the level of expression by
the Fd-rolD promoter exceeds that of the 35S promoter. As shown in
Figures 3B and 3C, the average expression of the Fd-rolD promoter
is higher and the variation in expression is significantly lower.
In conclusion, we have created a promoter that easily withstands
comparison with the 35S promoter in three major crops.

Example 6 
Expression of nptII transgene.

In order to also check the usability of the Fd-rolD promoter for other
purposes, the promoter was linked to the nptII gene, whose gene
product confers resistance in plants to the antibiotic kanamycin.
This element was placed between
the left and right borders of the T-DNA, allowing Agrobacterium
tumefaciens-mediated transfer to plants. As a control, similar
constructs in which the expression of the nptII gene was under control
of the nos promoter were used.

The resistance to kanamycin in transgenic potato plants is manifested 
by the development of transgenic calli and shoots during a standard 
transformation procedure, in which kanamycin is used in the culture 
medium. 

On average, for the constructs with the nos-nptII selection cassette,
the transformation frequency for potato is 45%; for constructs with
the Fd-rolD-nptII selection cassette the frequency is on average 61%.
While we do not know at this moment how relevant the increase in
transformation frequency is for this construct, it indicates that the
Fd-rolD promoter is at least as suitable for driving a heterologous
gene such as nptII as commonly used constitutive promoters such as
nos.

Example 7

Comparing visual scoring to quantitative values 

From the analysis of GUS expression based on 1) histochemical analysis
and scoring relative to an internal control and 2) a quantitative
analysis of GUS enzymatic activity, we have learned that both give a
reproducible quantitative figure. A thorough analysis of both scores
for tomato and oilseed rape leaves and roots leads to the conclusion
that scale 3, which compares best to that of 35S, equals about 2000
pmol MU/minute.mg in the quantitative analysis. In scale 1 and 2,
averages are 1000 and 1500, respectively, which sets the value of
scale 1 at 50% of 35S. About 1% expression of that level equals 100 pmol
MU/minute.mg, which is frequently under the detection level for
histochemical detection, although sometimes detectable as very light
blue staining due to GUS expression. Therefore one can use
histochemical staining as a marker for promoter efficacy, by measuring
the level of blue staining, and use these data to select promoter
elements of use.
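
The correspondence between visual scale and enzymatic activity given above can be written as a small lookup. This is a minimal sketch under stated assumptions: only the three scale values named in the text are included, the function names are hypothetical, and no extrapolation to scale 4 or 5 is attempted because the text gives no figures for them.

```python
# Approximate GUS activity (pmol MU/minute.mg) per visual scale value,
# using only the three figures quoted in Example 7.
SCORE_TO_ACTIVITY = {1: 1000, 2: 1500, 3: 2000}

# Scale value that "compares best to that of 35S" according to the text.
REFERENCE_35S_SCORE = 3


def approx_activity(score):
    """Return the approximate activity for a visual score of 1, 2 or 3."""
    if score not in SCORE_TO_ACTIVITY:
        raise ValueError(f"no quantitative figure is given for scale {score}")
    return SCORE_TO_ACTIVITY[score]


def fraction_of_35s(score):
    """Express a visual score as a fraction of the 35S reference level."""
    return approx_activity(score) / approx_activity(REFERENCE_35S_SCORE)


if __name__ == "__main__":
    for s in sorted(SCORE_TO_ACTIVITY):
        print(s, approx_activity(s), f"{fraction_of_35s(s):.0%}")
```

Raising an error for scores outside 1 to 3 is a deliberate choice: it keeps the lookup from suggesting calibration values that the source does not provide.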

Example 8

Construction of the SAM1 promoter and fusion to GUS.

For the construction of the SAM1 promoter, genomic DNA (SEQ ID NO: 4)
was isolated from Arabidopsis thaliana Landsberg erecta leaves using a
CTAB extraction procedure. Primers were designed based on the
published sequence of the SAM1 gene from Arabidopsis thaliana K85
(Peleman et al., (1989) Gene 84, 359-369). In a PCR (30 cycles of 45
seconds 95°C, 45 seconds 50°C and 1' 72°C; the same program was used in
all other PCRs described in this part) the promoter element was
amplified using primers FR-Psam-143 5' AGA TTT GTA TTG CAG CGA TTT CAT
TTT AG 3' (SEQ ID NO: 5) and FR-Psam-216 5' ATC TGG TCA CAG AGC TTG TC
3' (SEQ ID NO: 6), yielding a fragment of about 550 bp. The DNA
fragment was isolated from an agarose gel and cloned into the pGEM-T
vector (Promega Corp., Madison WI, USA). This clone was used as a
template to introduce an Nco I site at the translation start by PCR
using primers FR-Psam-144 5' GTC TCC ATG GTG CTA CAA AGA ATA G 3'
(SEQ ID NO: 7) and FR-Psam-143. The resulting 500 bp fragment was
cloned in the pGEM-T vector. The EcoR I and Hind III sites located in
the promoter region were removed by PCR in two steps using this clone
as a template. In this PCR a BamH I site was introduced upstream of
the SAM1 promoter and a Hind III site was introduced at the 3' side of
the promoter. In the first PCR step three promoter fragments were
generated. The first fragment (1) contains the 5' BamH I site and
the mutated EcoR I site and is generated using primers FR-Psam-248
5' CGG GAT CCT GCA GCG ATT TCA TTT TAG 3' (SEQ ID NO: 8) and
FR-Psam-249 5' ACA TGA ACG AAT GCA AAA TCT C 3' (SEQ ID NO: 9). The
middle fragment (2) is obtained with primers FR-Psam-250
5' AGA TTT TGC ATT CGT TCA TGT G 3' (SEQ ID NO: 10) and FR-Psam-251
5' TGT AAG CAT TTC TTA GAT TCT C 3' (SEQ ID NO: 11). This fragment
has a partial overlap with fragments 1 and 3
and has mutated EcoR I and Hind III sites. The third PCR fragment (3)
contains the mutated internal Hind III site and introduces a Hind
III site at the 3' end of the promoter encompassing the Nco I site at
the translation start; it is generated using primers FR-Psam-252
5' AAG AAA TGC TTA CAG GAT ATG G 3' (SEQ ID NO: 12) and FR-Psam-253
5' GAC AAG CTT GAT CCC ATG GTG CTA CAA AGA ATA G 3' (SEQ ID NO: 13).
In a second PCR the three fragments 1, 2 and 3 were mixed together in
one tube and amplified with primers FR-Psam-248 and FR-Psam-253. Due
to the overlap between fragments 1 and 2, and 2 and 3, this PCR yields
the complete mutated promoter. After digestion with BamH I and Hind III
the resulting SAM1 promoter was cloned in a pBSK+ vector. The SAM1
promoter was then cloned into a vector containing a GUSintron-TPI-II
reporter cassette by exchanging the upstream region using the BamH I
and Nco I restriction sites. This was done by digestion of the SAM1
clone with BamH I and Nco I and isolation of the promoter fragment
from an agarose gel. The GUS vector was digested with the same enzymes
and the vector was then isolated from an agarose gel, thus discarding
the original upstream promoter sequences.

The SAM1 promoter-GUSintron-TPI-II reporter cassette was then cut out
of the vector by BamH I and EcoR I digestion, after which the reporter
cassette was isolated from an agarose gel and cloned into the binary
vector pMOG800 digested with BamH I and EcoR I. The resulting binary
vector pMOG1402 was introduced in Agrobacterium tumefaciens strain
EHA105 for transformation to potato.
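
The second PCR step relies on the ends of neighbouring fragments sharing sequence. As a rough check, the overlap implied by the primer pairs can be computed from the primer sequences in the sequence listing (SEQ ID NO: 9 to 12). The sketch below is illustrative only; the function names are hypothetical, and it assumes that the 3' end of each upstream fragment is the reverse complement of its reverse primer.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def overlap_length(upstream_reverse_primer, downstream_forward_primer):
    """Length of the longest match between the 3' end of the upstream
    fragment (top strand) and the 5' end of the downstream fragment."""
    end = reverse_complement(upstream_reverse_primer)
    start = downstream_forward_primer
    for k in range(min(len(end), len(start)), 0, -1):
        if end[-k:] == start[:k]:
            return k
    return 0


# Primer sequences as given in the sequence listing.
FR_PSAM_249 = "ACATGAACGAATGCAAAATCTC"  # SEQ ID NO: 9, reverse primer of fragment 1
FR_PSAM_250 = "AGATTTTGCATTCGTTCATGTG"  # SEQ ID NO: 10, forward primer of fragment 2
FR_PSAM_251 = "TGTAAGCATTTCTTAGATTCTC"  # SEQ ID NO: 11, reverse primer of fragment 2
FR_PSAM_252 = "AAGAAATGCTTACAGGATATGG"  # SEQ ID NO: 12, forward primer of fragment 3

if __name__ == "__main__":
    print("fragments 1/2:", overlap_length(FR_PSAM_249, FR_PSAM_250))
    print("fragments 2/3:", overlap_length(FR_PSAM_251, FR_PSAM_252))
```

With these sequences the computed overlaps are 21 and 14 nucleotides, respectively, which is the shared sequence that lets the three fragments prime on each other before amplification with the outer primers.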
Example 9

Construction of the Pc-SAM1 chimaeric promoter and fusion to the GUS
gene

The plastocyanin enhancer (Pc) from Arabidopsis thaliana Col-0 was
obtained by PCR. For this purpose primer FR-Pc-146 (5' agt ggt acc atc
ata ata ctc atc ctc ctt c 3') (SEQ ID NO: 14) and primer FR-Pc-247
(5' cga agc ttt aca aat cta att tca tca cta aat cgg a 3') (SEQ ID NO: 15)
were developed, introducing a Kpn I restriction site upstream of the
enhancer and a Hind III restriction site downstream of the
plastocyanin enhancer. The PCR was performed using Cloned Pfu DNA
polymerase (Stratagene) for 30 cycles of 1' 95°C, 1' 50°C, 4' 72°C and
1 cycle of 1' 95°C, 1' 50°C, 10' 72°C. The resulting PCR fragment was
ligated into a high copy cloning vector using Kpn I and Hind III,
resulting in construct pPM15.1.

This clone was used as a template for a PCR (30 cycles of 1' 95°C, 1'
50°C and 2' 72°C) using primers FR-Pc-145 5' GCT GCA ATA CAA ATC TAA
TTT CAT CAC TAA ATC GG 3' (SEQ ID NO: 16) and FR-Pc-146 5' AGT GGT ACC
ATC ATA ATA CTC ATC CTC CTT C 3' (SEQ ID NO: 14). The PCR generates a
fragment of about 850 bp encompassing the Pc enhancer (SEQ ID NO: 17),
containing an upstream Kpn I site and overlap with the 5' side of the
SAM1 promoter (see Example 8). The PCR fragment was then mixed with a
PCR fragment of the SAM1 promoter generated with primers FR-Psam-143
and FR-Psam-144 using the pBKS+ clone containing the adjusted SAM1
promoter described in Example 8. In a PCR on this mixture the Pc-SAM
chimaeric promoter was generated using primers FR-Psam-144 and
FR-Pc-146. The resulting promoter fragment of about 1.3 kb (SEQ ID NO: 18)
was isolated from an agarose gel after digestion with Kpn I and Nco I
and then cloned into a high copy cloning vector (pUC28) digested with
the same enzymes. The promoter fragment was then cut out of this
vector by digestion with BamH I and Nco I and cloned in front of the
GUSintron gene as described above in Example 8. The complete
Pc-SAM-GUS-TPI-II reporter cassette was then cloned into pMOG800 as
described for the SAM1-GUS-TPI-II reporter cassette in Example 8. The
resulting binary vector pMOG1400 was introduced in Agrobacterium
tumefaciens strain EHA105 for transformation to potato.
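
The restriction sites that the primers are said to introduce can be checked directly against the primer sequences in the sequence listing. The sketch below scans for the standard recognition sequences of the enzymes named in Examples 8 and 9; the function and dictionary names are hypothetical, and positions are 0-based offsets from the 5' end of the primer.

```python
# Standard recognition sequences of the enzymes named in the text.
RECOGNITION_SITES = {
    "BamH I": "GGATCC",
    "EcoR I": "GAATTC",
    "Hind III": "AAGCTT",
    "Kpn I": "GGTACC",
    "Nco I": "CCATGG",
}

# Primer sequences as given in the sequence listing.
PRIMERS = {
    "FR-Pc-146": "AGTGGTACCATCATAATACTCATCCTCCTTC",        # SEQ ID NO: 14
    "FR-Pc-247": "CGAAGCTTTACAAATCTAATTTCATCACTAAATCGGA",  # SEQ ID NO: 15
    "FR-Psam-144": "GTCTCCATGGTGCTACAAAGAATAG",            # SEQ ID NO: 7
}


def find_sites(seq):
    """Map enzyme name to the 0-based start positions of its site in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = [i for i in range(len(seq) - len(site) + 1)
                     if seq.startswith(site, i)]
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

Each primer carries exactly the site attributed to it in the text (Kpn I in FR-Pc-146, Hind III in FR-Pc-247, Nco I in FR-Psam-144). The same scan can be applied to a full promoter sequence to confirm that internal EcoR I and Hind III sites are absent after mutagenesis.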
Example 10

Construction of the Pc enhancer-35S promoter and fusion to the GUS
gene

The plastocyanin enhancer (Pc) from Arabidopsis thaliana Col-0 was
obtained by PCR (see above).

This clone was used as a template for a PCR (30 cycles of 1' 95°C, 1'
50°C and 2' 72°C; all other PCR reactions described in this part were
carried out with the same program) using primers FR-Pc-291 5' GTC TTG
TAC AAA TCT AAT TTC ATC ACT AAA TCG G 3' (SEQ ID NO: 19) and FR-Pc-146
5' AGT GGT ACC ATC ATA ATA CTC ATC CTC CTT C 3' (SEQ ID NO: 14). The
PCR generates a fragment of about 850 bp encompassing the Pc enhancer,
containing an upstream Kpn I site and overlap with the 5' side of the
minimal 35S promoter. The minimal 35S promoter was obtained in a PCR
using pMOG971 as a template (containing the 35S promoter and omega 5'
UTR) and primers FR-35S-292 5' TTA GAT TTG TAC AAG ACC CTT CCT CTA TAT
AAG G 3' (SEQ ID NO: 20) and ls19 (SEQ ID NO: 21). The resulting
fragment has overlap with the Pc enhancer and contains an internal Nco I
site at the translation start. The two PCR fragments were then mixed
and a PCR reaction was carried out using primers FR-Pc-146 and ls19.
The resulting fragment was then digested with Kpn I and Nco I, isolated
from an agarose gel and cloned in pUC28 digested with the same enzymes.
The resulting clone was subsequently digested with BamH I and Nco I,
and the promoter fragment (SEQ ID NO: 22) was isolated from an agarose
gel and cloned upstream of the GUS gene as described in Example 8. The
complete reporter cassette was then introduced in the binary vector
pMOG800 as described in Example 8. The resulting binary vector
pMOG1401 was introduced in Agrobacterium tumefaciens strain EHA105 for
transformation to potato.
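
The fusion of the enhancer fragment to the minimal 35S promoter can be illustrated by joining the two fragment ends through their shared sequence. This is a simplified sketch, not a model of the PCR itself: the function names are hypothetical, only the primer-derived fragment ends are joined, and the result is compared with the corresponding stretch of SEQ ID NO: 22 (positions 841 to 910 as printed in the sequence listing).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def fuse_by_overlap(upstream, downstream, min_overlap=10):
    """Join two top-strand sequences through their longest end overlap."""
    for k in range(min(len(upstream), len(downstream)), min_overlap - 1, -1):
        if upstream[-k:] == downstream[:k]:
            return upstream + downstream[k:]
    raise ValueError("no overlap of the required length")


FR_PC_291 = "GTCTTGTACAAATCTAATTTCATCACTAAATCGG"   # SEQ ID NO: 19 (reverse primer)
FR_35S_292 = "TTAGATTTGTACAAGACCCTTCCTCTATATAAGG"  # SEQ ID NO: 20 (forward primer)

# Positions 841-910 of SEQ ID NO: 22, copied from the sequence listing.
SEQ22_EXCERPT = ("TATCTTTTAAACCGTTACCGATTTAGTGATGAAATTAGAT"
                 "TTGTACAAGACCCTTCCTCTATATAAGGAA")

# 3' end of the enhancer fragment, written as top strand.
enhancer_end = reverse_complement(FR_PC_291)
junction = fuse_by_overlap(enhancer_end, FR_35S_292)

if __name__ == "__main__":
    print(junction)
    print("found in SEQ ID NO: 22:", junction in SEQ22_EXCERPT)
```

The joined junction is found in the excerpt of SEQ ID NO: 22, which is consistent with the two fragments having been fused through the overlap built into primers FR-Pc-291 and FR-35S-292.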

Example 11 

Expression levels and patterns in in vitro grown plants 

Transformed plants were grown up, and leaf samples of fully
regenerated plants were analyzed for GUS expression. In Figure 4 the
analysis of expression in leaf mesophyll, leaf vascular system, stems
and roots is indicated. A very low level of GUS staining was observed
in the mesophyll part of leaves of SAM1 transgenic plants, although the
scoring indicates a GUS expression level of 0.

BUDAPEST TREATY ON THE INTERNATIONAL
RECOGNITION OF THE DEPOSIT OF MICROORGANISMS
FOR THE PURPOSES OF PATENT PROCEDURE

INTERNATIONAL FORM

RECEIPT IN THE CASE OF AN ORIGINAL DEPOSIT
issued pursuant to Rule 7.1 by the
INTERNATIONAL DEPOSITARY AUTHORITY
identified at the bottom of this page

Name and address of depositor:

Mogen International N.V.
Einsteinweg 97
2333 CB LEIDEN
Nederland

I. IDENTIFICATION OF THE MICROORGANISM

Identification reference given by the DEPOSITOR:
E. coli DH5 alpha strain / the plasmid pMOG800

Accession number given by the INTERNATIONAL DEPOSITARY AUTHORITY:
CBS 414.93

II. SCIENTIFIC DESCRIPTION AND/OR PROPOSED TAXONOMIC DESIGNATION

The microorganism identified under I above was accompanied by:

[X] a scientific description
[ ] a proposed taxonomic designation

(mark with a cross where applicable)

III. RECEIPT AND ACCEPTANCE

This International Depositary accepts the microorganism identified under I above, which was
received by it on Thursday, 12 August 1993 (date of the original deposit) (1)

IV. RECEIPT OF REQUEST FOR CONVERSION

The microorganism identified under I above was received by this International Depositary
Authority on not applicable (date of the original deposit) and a
request to convert the original deposit to a deposit under the Budapest Treaty was received by
it on not applicable (date of receipt of request for conversion)

V. INTERNATIONAL DEPOSITARY AUTHORITY

Name: Centraalbureau voor Schimmelcultures

Address: Oosterstraat 1
P.O. Box 273
3740 AG BAARN
The Netherlands

Signature(s) of person(s) having the power to
represent the International Depositary
Authority or of authorized official(s):

drs F.M. van Asma

Date: Friday, 13 August 1993

(1) Where Rule 6.4(d) applies, such date is the date on which the status of international
depositary authority was acquired.

Form BP/4 (sole page) CBS/9107

AMENDED CLAIMS

1. Chimaeric plant promoter, characterized in that it
comprises a minimal promoter and transcription-activating
elements from a set of promoters, which elements have a
complementary pattern and level of transcription in a plant.

2. Chimaeric plant promoter according to claim 1,
characterized in that each of the transcription-activating
elements does not exhibit an absolute tissue-specificity, but
mediates transcriptional activation in most plant parts at a
level of >1% of the level reached in the part of the plant in
which transcription is most active.

3. Chimaeric plant promoter according to claim 1 or 2,
characterized in that one promoter of the set of promoters is
specifically active in green parts of the plant, while the other
promoter is specifically active in underground parts of the
plant.

4. Chimaeric plant promoter according to claim 3,
characterized in that it is a combination of the ferredoxin and
the RolD promoter.

5. Chimaeric plant promoter of claim 4, characterized in that
the minimal promoter element is derived from the ferredoxin
promoter.

6. Chimaeric plant promoter according to claim 4 or 5,
characterized in that the ferredoxin promoter is derived from
Arabidopsis thaliana.

7. Chimaeric plant promoter according to claim 6,
characterized in that it comprises the sequences of SEQ ID NO: 1
and SEQ ID NO: 2.

8. Chimaeric promoter according to claim 7, characterized in
that it comprises the sequence of SEQ ID NO: 3.

9. Chimaeric plant promoter according to claim 3,
characterized in that it is a combination of the plastocyanin
and the S-adenosyl-methionine-1 promoter.

10. Chimaeric plant promoter according to claim 9,
characterized in that the minimal promoter element is derived
from the S-adenosyl-methionine-1 promoter.

11. Chimaeric plant promoter according to claim 9 or 10,
characterized in that the plastocyanin promoter is derived from
Arabidopsis thaliana.

12. Chimaeric plant promoter according to claim 9, 10 or 11,
characterized in that the S-adenosyl-methionine-1 promoter is
derived from Arabidopsis thaliana.

13. Chimaeric plant promoter according to claim 12,
characterized in that it comprises the sequences of SEQ ID NO: 4
and SEQ ID NO: 17.

14. Chimaeric plant promoter according to claim 13,
characterized in that it comprises the sequence of SEQ ID NO:
21.

15. Chimaeric gene construct for the expression of genes in
plants comprising the promoter of any of claims 1-14.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: Zeneca MOGEN
(B) STREET: Einsteinweg 97
(C) CITY: Leiden
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): 2333 CB
(G) TELEPHONE: (31) 71-5258282
(H) TELEFAX: (31) 71-5221471

(ii) TITLE OF INVENTION: New constitutive plant promoters

(iii) NUMBER OF SEQUENCES: 22

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(vi) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: EP 97203912.7
(B) FILING DATE: 12-DEC-1997

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 520 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GACTGAAGTG TGAAGGTGGA GATTATGTAT TCACTTGTTG ATTTGGTATA CATTCTATGT 60
AAGGTTCAAT TATTTACGTT ATATAATTAT AATGGAGTAA TTTACAGTAA TTGGGTTAAA 120
ATGGTTTGAT TCGGTCAGGT TGATACGGTT TGGAAGTTAA ACCCGGCCTA GATATGATGT 180
TACAACCAGT CCACATCTTT TATGATTTTA GTGGAACAAA CGAAGAGTTA TTTAGACGAT 240
ACAAACAAGG TCCGAATAAG TGTGAGCTGT CCCAAGTAAG ACCACGTAAT ACTCACCTCA 300
ACAAGATAGT GTTCTTAAAG TGTGTCAAAC ACAATCACAC ACACACAAAT CATAAAACAC 360
AAAGACGATA ATCCATCGAT CCACAGAATA GACGCCACGT GGTAGATAGG ATTCTCACTA 420
AAAAGTTCTC ACCTTTTAAT CTTTCTCCAC GCCATTTCCA CAAGCCATAA TCCTCAAAAA 480
TCTCAACTTT ATCTCCCAAA ACACAAATCT AGAAACCATG 520

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 300 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

CCCACTACAA TGAATTTGTT CGTGAACTAT TAGTTGCGGG CCTTGGCATC CGACTACCTC 60
TGCGGCAATA TTATATTCCC TGGGCCCACC GTGAACCCAA TTTCGCCTAT TTATTCATTA 120
CCCCCATTAA CATTGAAGTA GTCATGATGG GCCTGCAGCA CGTTGGTGAG GCTGGCACAA 180
CTCATCCATA TACTTTCTGA CCGGATCGGC ACATTATTGT AGAAAACGCG GACCCACAGC 240
GCACTTTCCA AAGCGGTGCC GCGTCAGAAT GCGCTGGCAG AAAAAAATTA ATCCAAAAGT 300

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 840 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GGATCCGAGC TTGCATGCCC CCACTACAAT GAATTTGTTC GTGAACTATT AGTTGCGGGC 60
CTTGGCATCC GACTACCTCT GCGGCAATAT TATATTCCCT GGGCCCACCG TGAACCCAAT 120
TTCGCCTATT TATTCATTAC CCCCATTAAC ATTGAAGTAG TCATGATGGG CCTGCAGCAC 180
GTTGGTGAGG CTGGCACAAC TCATCCATAT ACTTTCTGAC CGGATCGGCA CATTATTGTA 240
GAAAACGCGG ACCCACAGCG CACTTTCCAA AGCGGTGCCG CGTCAGAATG CGCTGGCAGA 300
AAAAAATTAA TCCAAAAGTG ACTGAAGTGT GAAGGTGGAG ATTATGTATT CACTTGTTGA 360
TTTGGTATAC ATTCTATGTA AGGTTCAATT ATTTACGTTA TATAATTATA ATGGAGTAAT 420
TTACAGTAAT TGGGTTAAAA TGGTTTGATT CGGTCAGGTT GATACGGTTT GGAAGTTAAA 480
CCCGGCCTAG ATATGATGTT ACAACCAGTC CACATCTTTT ATGATTTTAG TGGAACAAAC 540
GAAGAGTTAT TTAGACGATA CAAACAAGGT CCGAATAAGT GTGAGCTGTC CCAAGTAAGA 600
CCACGTAATA CTCACCTCAA CAAGATAGTG TTCTTAAAGT GTGTCAAACA CAATCACACA 660
CACACAAATC ATAAAACACA AAGACGATAA TCCATCGATC CACAGAATAG ACGCCACGTG 720
GTAGATAGGA TTCTCACTAA AAAGTTCTCA CCTTTTAATC TTTCTCCACG CCATTTCCAC 780
AAGCCATAAT CCTCAAAAAT CTCAACTTTA TCTCCCAAAA CACAAATCTA GAAACCATGG 840

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 477 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Arabidopsis thaliana

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GGATCCTGCA GCGATTTCAT TTTAGATTCT CAAAAATATT CTCAGATGTG TGGGATTTGA 60
GTAGAGTTTA TGTTGCGTTG GCATGATTTG AATAGTATGC AAGATTTTTG AGATTTTGCA 120
TTCGTTCATG TGTGTATGTG TGATTGTAGC TTGATATGAT TTAACCTGTT AGTTAAATGT 180
GCATAGACAA TAAGTAACAT ACGAAGCGAG TCACTAAGCA TAAGAGTCAA CTTGTTTTGC 240
TGAAAAGATA TCACTTATGA TTTTCGAATC ATTTTAGCTT TTTTGTCACT TGAGCTTAAT 300
GATTCTTCTG AAATTCGATT CTTTGTTTGG TTTATGTCAC ATTCTTTAGA ATTGAGAATC 360
TAAGAAATGC TTACAGGATA TGGTGAAACT ATTCTTTTAA GATAGCATGA TGCTTCTTTT 420
ATGATTCTAC AGTGGCTAAG TCATTTTTTT TTTGTTCTAT TCTTTGTAGC ACCATGG 477

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

AGATTTGTAT TGCAGCGATT TCATTTTAG 29

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

ATCTGGTCAC AGAGCTTGTC 20

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GTCTCCATGG TGCTACAAAG AATAG 25

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

CGGGATCCTG CAGCGATTTC ATTTTAG 27

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ACATGAACGA ATGCAAAATC TC 22

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

AGATTTTGCA TTCGTTCATG TG 22

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

TGTAAGCATT TCTTAGATTC TC 22

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

AAGAAATGCT TACAGGATAT GG 22

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GACAAGCTTG ATCCCATGGT GCTACAAAGA ATAG 34

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

AGTGGTACCA TCATAATACT CATCCTCCTT C 31

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CGAAGCTTTA CAAATCTAAT TTCATCACTA AATCGGA 37

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GCTGCAATAC AAATCTAATT TCATCACTAA ATCGG 35

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 876 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Arabidopsis thaliana

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

ATCATAATAC TCATCCTCCT TCTCAAGGTT CGTACGTATT ATCAATATCT AGTATATACT 60
TGTCTTTGTT CTATGCTTTA TATCATCATT TTATGACAAA AAATGATTAA GGTCTTAGTT 120
AATGATTATG TATATGTGAA ACTTATATTT AGGGGCACAA TTTAATTTCG TATGATAATT 180
GTCTAGTTAG CTTTATGTAC TTATCATAAA AACCTTAGTG TTTATCGCAA TACTTTTCAA 240
ATATAGTGTA GAATCATAAT GGTCCCACTG TCATTATGTT TGATGCAAAT CTATTTGGAT 300
TTTGTTGGAT AATAAACCGA TGACGTGGAC CAGACCAGTA GCTATAAGAT TTGGTTCACA 360
TAGAAATTTT TTATAAGATA ATGTATCTAG GTTTGCTTAT GATTATACAT GTGATATTTA 420
ATACATGGCA CAGGTTCGTC GAGTTTCACA GCCATAGGTA CAATAGAAGG CAAATTCGAT 480
TGTGGTTATC TGGTAAAAGT TAAGTTGGGC TCAGAGATTC TTAACGGCGT TCTTTATCAT 540
TCGGCCCAGC CCGGCCCATC ATCATCTCCA ACCGCTGTTC TAAACAATGC CGTTGTACCT 600
TATGTTGAAA CTGGGAGGAG ACGGCGTCGT TTAGGTAAAA GACGAAGAAG CAGACGCAGA 660
GAAGATCCGA ATTACCCGAA ACCGAACCGG AGCGGTTACA ATTTCTTCTT TGCTGAGAAA 720
CATTGCAAGC TCAAATCACT TTATCCCAAC AAGGAGAGAG AGTTTACGAA ACTTATCGGA 780
GAATCGTGGA GCAATCTCTC TACCGAAGAA CGAATGGTAA CAAATTATCT TTTAAACCGT 840
TACCGATTTA GTGATGAAAT TAGATTTGTA GTAAAT 876

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1357 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

GGATCCCCGG GTACCATCAT AATACTCATC CTCCTTCTCA AGGTTCGTAC GTATTATCAA 60
TATCTAGTAT ATACTTGTCT TTGTTCTATG CTTTATATCA TCATTTTATG ACAAAAAATG 120
ATTAAGGTCT TAGTTAATGA TTATGTATAT GTGAAACTTA TATTTAGGGG CACAGTTTAA 180
TTTCGTATGA TAATTGTCTA GTTAGCTTTA TGTACTTATC ATAAAAACCT TAGTGTTTAT 240
CGCAATACTT TTCAAATATA GTGTAGAATC ATAATGGTCC CACTGTCATT ATGTTTGATG 300
CAAATCTATT TGGATTTTGT TGGATAATAA ACCGATGACG TGGACCAGAC CAGTAGCTAT 360
AAGATTTGGT TCACATAGAA ATTTTTTATA AGATAATGTA TCTAGGTTTG CTTATGATTA 420
TACATGTGAT ATTTAATACA TGGCACAGGT TCGTCGAGTT TCACAGCCAT AGGTACAATA 480
GAAGGCAAAT TCGATTGTGG TTATCTGGTA AAAGTTAAGT TGGGCTCAGA GATTCTTAAC 540
GGCGTTCTTT ATCATTCGGC CCAGCCCGGC CCATCATCAT CTCCAACCGC TGTTCTAAAC 600
AATGCCGTTG TACCTTATGT TGAAACTGGG AGGAGACGGC GTCGTTTAGG TAAAAGACGA 660
AGAAGCAGAC GCAGAGAAGA TCCGAATTAC CCGAAACCGA ACCGGAGCGG TTACAATTTC 720
TTCTTTGCTG AGAAACATTG CAAGCTCAAA TCACTTTATC CCAACAAGGA GAGAGAGTTT 780
ACGAAACTTA TCGGAGAATC GTGGAGCAAT CTCTCTACCG AAGAACGAAT GGTAACAAAT 840
TATCTTTTAA ACCGTTACCG ATTTAGTGAT GAAATTAGAT TTGTATTGCA GCGATTTCAT 900
TTTAGATTCT CAAAAATATT CTCAGATGTG TGGGATTTGA GTAGAGTTTA TGTTGCGTTG 960
GCATGATTTG AATAGTATGC AAGATTTTTG AGATTTTGCA TTCGTTCATG TGTGTATGTG 1020
TGATTGTAGC TTGATATGAT TTAACCTGTT AGTTAAATGT GCATAGACAA TAAGTAACAT 1080
ACGAAGCGAG TCACTAAGCA TAAGAGTCAA CTTGTTTTGC TGAAAAGATA TCACTTATGA 1140
TTTTCGAATC ATTTTAGCTT TTTTGTCACT TGAGCTTAAT GATTCTTCTG AAATTCGATT 1200
CTTTGTTTGG TTTATGTCAC ATTCTTTAGA ATTGAGAATC TAAGAAATGC TTACAGGATA 1260
TGGTGAAACT ATTCTTTTAA GATAGCATGA TGCTTCTTTT ATGATTCTAC AGTGGCTAAG 1320
TCATTTTTTT TTTGTTCTAT TCTTTGTAGC ACCATGG 1357

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GTCTTGTACA AATCTAATTT CATCACTAAA TCGG 34

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

TTAGATTTGT ACAAGACCCT TCCTCTATAT AAGG 34

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

TTCCCAGTCA CGACGTTGT 19

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1006 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

GGATCCCCGG GTACCATCAT AATACTCATC CTCCTTCTCA AGGTTCGTAC GTATTATCAA 60
TATCTAGTAT ATACTTGTCT TTGTTCTATG CTTTATATCA TCATTTTATG ACAAAAAATG 120
ATTAAGGTCT TAGTTAATGA TTATGTATAT GTGAAACTTA TATTTAGGGG CACAGTTTAA 180
TTTCGTATGA TAATTGTCTA GTTAGCTTTA TGTACTTATC ATAAAAACCT TAGTGTTTAT 240
CGCAATACTT TTCAAATATA GTGTAGAATC ATAATGGTCC CACTGTCATT ATGTTTGATG 300
CAAATCTATT TGGATTTTGT TGGATAATAA ACCGATGACG TGGACCAGAC CAGTAGCTAT 360
AAGATTTGGT TCACATAGAA ATTTTTTATA AGATAATGTA TCTAGGTTTG CTTATGATTA 420
TACATGTGAT ATTTAATACA TGGCACAGGT TCGTCGAGTT TCACAGCCAT AGGTACAATA 480
GAAGGCAAAT TCGATTGTGG TTATCTGGTA AAAGTTAAGT TGGGCTCAGA GATTCTTAAC 540
GGCGTTCTTT ATCATTCGGC CCAGCCCGGC CCATCATCAT CTCCAACCGC TGTTCTAAAC 600
AATGCCGTTG TACCTTATGT TGAAACTGGG AGGAGACGGC GTCGTTTAGG TAAAAGACGA 660
AGAAGCAGAC GCAGAGAAGA TCCGAATTAC CCGAAACCGA ACCGGAGCGG TTACAATTTC 720
TTCTTTGCTG AGAAACATTG CAAGCTCAAA TCACTTTATC CCAACAAGGA GAGAGAGTTT 780
ACGAAACTTA TCGGAGAATC GTGGAGCAAT CTCTCTACCG AAGAACGAAT GGTAACAAAT 840
TATCTTTTAA ACCGTTACCG ATTTAGTGAT GAAATTAGAT TTGTACAAGA CCCTTCCTCT 900
ATATAAGGAA GTTCATTTCA TTTGGAGAGG ACACGTATTT TTACAACAAT TACCAACAAC 960
AACAAACAAC AAACAACATT ACAATTACTA TTTACAATTA CCATGG 1006
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